A Tree Projection Algorithm for Generation of Frequent Item Sets
Authors
Abstract
In this paper we propose algorithms for generation of frequent itemsets by successive construction of the nodes of a lexicographic tree of itemsets. We discuss different strategies in generation and traversal of the lexicographic tree such as breadth-first search, depth-first search, or a combination of the two. These techniques provide different trade-offs in terms of the I/O, memory, and computational time requirements. We use the hierarchical structure of the lexicographic tree to successively project transactions at each node of the lexicographic tree, and use matrix counting on this reduced set of transactions for finding frequent itemsets. We tested our algorithm on both real and synthetic data. We provide an implementation of the tree projection method which is up to one order of magnitude faster than other recent techniques in the literature. The algorithm has a well-structured data access pattern which provides data locality and reuse of data for multiple levels of the cache. We also discuss methods for parallelization of the TreeProjection algorithm.
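The projection and matrix-counting idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, in-memory, depth-first variant, and the names `tree_projection` and `_expand` are hypothetical. It assumes each node of the lexicographic tree keeps only the transactions that contain its itemset, restricted to the node's candidate extensions, and that a pair-count matrix over those extensions yields the supports of the itemsets two levels below the node.

```python
from itertools import combinations


def tree_projection(transactions, min_support):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets.

    Simplified depth-first sketch; names and structure are illustrative only.
    """
    # Level 1: count single items over the full database.
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    frequent = {(i,): c for i, c in counts.items() if c >= min_support}

    # The root's extensions are the frequent items in lexicographic order.
    root_ext = sorted(i for (i,) in frequent)
    keep = set(root_ext)
    # Project every transaction onto the frequent items, sorted.
    root_proj = [sorted(set(t) & keep) for t in transactions]
    _expand((), root_ext, root_proj, min_support, frequent)
    return frequent


def _expand(prefix, extensions, projected, min_support, frequent):
    """Count extension pairs at the node `prefix` and recurse into children."""
    if len(extensions) < 2:
        return
    index = {item: k for k, item in enumerate(extensions)}
    n = len(extensions)
    # Only the upper triangle is used; a dense list of lists keeps it simple.
    matrix = [[0] * n for _ in range(n)]
    for t in projected:
        # t is sorted, so each pair (a, b) satisfies a < b.
        for a, b in combinations(t, 2):
            matrix[index[a]][index[b]] += 1

    for a in extensions:
        row = matrix[index[a]]
        # Extensions of the child prefix + (a,): later items whose pair is frequent.
        child_ext = [b for b in extensions[index[a] + 1:]
                     if row[index[b]] >= min_support]
        for b in child_ext:
            frequent[prefix + (a, b)] = row[index[b]]
        if len(child_ext) >= 2:
            ext_set = set(child_ext)
            # Project: keep transactions containing a, restricted to child_ext.
            child_proj = [[x for x in t if x in ext_set]
                          for t in projected if a in t]
            _expand(prefix + (a,), child_ext, child_proj, min_support, frequent)


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(tree_projection(db, 2).items()):
        print(itemset, support)
```

Because each recursive call works on a smaller projected set, the counting cost at deeper nodes shrinks with the number of transactions that actually contain the prefix. A breadth-first or hybrid traversal, as the abstract mentions, would change the order in which nodes are expanded but not the counting step shown here.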
Similar resources
An Efficient Frequent Pattern Mining Algorithm to Find the Existence of K-Selective Interesting Patterns in Large Dataset Using SIFPMM
Association rule mining in huge databases is one of the most popular data exploration techniques for business decision makers. Discovering frequent item sets is the fundamental process in association rule mining. Several algorithms have been introduced in the literature to find frequent patterns. Those algorithms discover all combinations of frequent item sets for a given minimum support threshold. But som...
FP-Tree Based Algorithms Analysis: FP- Growth, COFI-Tree and CT-PRO
Mining frequent itemsets from a large transactional database is a critical and important task. Many algorithms have been proposed over the years, but FP-tree-like algorithms are considered very effective for efficiently mining frequent item sets. These algorithms are considered efficient because of their compact structure and also because of reduced generation of candidate itemse...
Indexed Enhancement on GenMax Algorithm for Fast and Less Memory Utilized Pruning of MFI and CFI
Mining frequent item sets is the essential problem in many data mining applications, such as the discovery of association rules, patterns, and many other important discovery tasks. Fast, memory-efficient methods for mining frequent item sets are highly desirable in transactional databases. Methods for mining frequent item sets have been implemented using a prefix-tree structure...
An Improvised Frequent Pattern Tree Based Association Rule Mining Technique with Mining Frequent Item Sets Algorithm and a Modified Header Table
In today’s world there is a wide availability of huge amounts of data, and thus a need to turn this data into useful information, which is referred to as knowledge. This demand for knowledge discovery has led to the development of many algorithms for determining association rules. One of the major problems faced by these algorithms is the generation of candidate sets. The FP...
A tree-projection-based algorithm for multi-label recurrent-item associative-classification rule generation
Associative classification is a promising classification method based on association-rule mining. A significant amount of work has already been dedicated to the process of building a classifier based on association rules. However, relatively little research has been performed on association-rule mining from multi-label data. In such data each example can belong, and thus should be classi...
Journal title:
- J. Parallel Distrib. Comput.
Volume 61, Issue
Pages -
Publication date: 2001